Abstract

In this paper, we develop a hierarchical Bayesian game framework where users compete to 
maximize their throughput by picking the best locally serving radio access network (RAN) with 
respect to their own measurement, their demand and a partial statistical channel state information 
(CSI) of other users. In particular, we investigate the properties of a Stackelberg game, in which the 
base station is a player on its own. We derive analytically the utilities related to the channel quality 
perceived by users to obtain the equilibria. We show by means of a Stackelberg formulation, how the 
operator, by sending appropriate information about the state of the channel, can optimize its global 
utility while users maximize their individual utilities. The proposed hierarchical decision approach 
for wireless networks can reach a good trade-off between the global network performance at the 
equilibrium and the requested amount of signaling. In particular, we show that when the network's
goal is orthogonal to the users' goal, the users can be led into a misleading association problem.

Index Terms 

WLAN, 3G LTE, association problem, misleading information, channel state information, game 
theory, Bayes-Nash equilibrium, Bayes-Stackelberg equilibrium. 
Fig. 1. The association problem of choice of the access point: Information available when taking a decision.



I. Introduction 

Efficient design of wireless networks calls for end users implementing radio resource
management (RRM), which requires knowledge of the mutual channel state information in
order to limit the influence of interference impairments on the decision making. However,
the full-CSI assumption is not always practical, because communicating channel gains between
different users of a time-varying channel within the channel coherence time may lead to a large
overhead. In this case, it is more appropriate to consider each channel coherence time as a
one-stage game where players are only aware of their own channel gains and of their opponents'
channel statistics (which vary slowly compared to the channel gains and can therefore be
communicated) [1]. The interaction between the players may be repeated, but with a different
and independent channel realization each time, so it is not a repeated game. This
motivates the use of games with incomplete information, also known as Bayesian games [2],
which have been incorporated into wireless communications for problems such as power
control [4] and spectrum management in the interference channel [5]. In [4], distributed
uplink power control in a multiple access (MAC) fading channel was studied and shown
to have a unique Nash equilibrium (NE) point. With the same incomplete information, it
was shown in [5] that in a symmetric interference channel with a one-time interaction, there
exists a unique symmetric strategy profile which is a NE point. This result, however, is limited
to scenarios where all users statistically experience identical channel conditions (due to the
symmetry assumption) and does not apply to interactions between weak and strong users.

In this paper, we generalize the analysis of [6] to the multi-user case and present an
alternative approach for improving the network efficiency by introducing a certain degree of 
hierarchy between the users and the base station. More specifically, we propose a Stackelberg 
formulation of the association problem when a partial channel state information is assumed at 
the transmitter. By Stackelberg we mean distributed decision making assisted by the network,
where the wireless users aim at maximizing their own utility, guided by aggregated
information broadcast by the network about the CSI of each user. We first show how to derive
the utilities of users that are related to their respective channel quality under the different 
association policies. We then derive the policy that corresponds to the Stackelberg equilibrium 
and compare it to the fully cooperative and the non-cooperative model. Technically, our
approach not only aims at improving the network equilibrium efficiency but also has two
nice features: (i) it allows the network to guide users to a desired equilibrium that optimizes
its own utility if it chooses the adequate information to send; (ii) only the individual user
demand and partial statistical CSI of other users are needed at each transmitter. Our approach
contributes to designing networks where intelligence is split between the base station (BS) 
and mobile stations (MSs) in order to find a desired trade-off between the global network 
performance reached at the equilibrium and the amount of signaling needed to make it work. 
Note that the Stackelberg formulation arises naturally in some contexts of practical interest. 
For example, hierarchy is naturally present in contexts where there are primary (licensed)
users and secondary (unlicensed) users who can sense their environment because they are
equipped with a radio [7]. It is also natural if the users have access to the medium in an
asynchronous manner.

The decision of which access point to connect to is typically left to the user. In some
cases, each of the access points belongs to a different operator. In other cases, the
choice of operator is offered to a user only once it connects to an access point. From a
practical point of view, the driver for the wireless card typically gives some information
concerning the channel state at each of the access points. Figure 1 is an example of the
information presented to a user when the opportunity of taking a decision is offered to him.
This can easily be thought of as a game situation. The game may involve a preliminary
decision on which access point (or service provider) to attempt a connection to. Once an attempt
is made, the user gets information on the pricing policy of the provider. Note that the
user may know the pricing information of one or more providers before making the decision,
since this pricing usually remains the same for a long period. He may, however, discover the
quality of service offered by an operator only after deciding which of the service providers
to connect to. In this game, the decisions may depend on the pricing strategy of each service
provider as well as on the quality of its service. The latter may be unknown and become
available only after the decision is taken, whereas the former may be available beforehand.
This is a complex stochastic game: as each user arrives at a random time, its decision is
affected by the state of the channel not only at present (i.e. the state it has available) but
also in the future, and the latter is determined by the decisions of future users. Yet a user is
not aware of when future arrivals will occur or what their decisions will be. This game has an
unusual information structure: the information is partial and misleading. It is misleading
because, although the channel state can indeed give information on the transmission rate, it is
known that the actual throughput of a user is a function not only of his channel state but also
of that of the other connected users [8]. The throughput is known to be lower bounded by the
harmonic mean of the rates available to each user. The real utility of a user is the throughput
he would get, and the user may not be aware that an access point with a better channel may
offer a lower throughput because more terminals are connected to it.

The association problem may also include a choice between several technologies, say
between 3G LTE, WiFi, Bluetooth and an ad hoc network. Figure 1 is an example of the
information available to a user in a game where one has the option of connecting to an ad hoc
network or to access points with different signal strengths. As before, the information given
to the user is misleading since the throughput of the user cannot be directly inferred from
the quality of his channel. These questions are tackled in this work, where the association
problem is modeled as a hierarchical Bayesian game. User strategies are decisions to
connect to one system or another according to local information about the quality of the
channel, the demand and partial statistical information about other users. The operator controls
the equilibrium of its wireless users to maximize its own utility by broadcasting appropriate
information. We first compute the users' utilities and then derive analytically the utilities
related to the channel quality perceived by the users.

Related Work 

When we deal with heterogeneous distributed networks, interactions among selfish users
sharing a common transmission channel can be modeled as a non-cooperative game using
the game theory framework [3]. Game theory provides a formal framework for studying the
interactions of strategic agents. Recently, there has been a surge in research activities that
employ game theory to model and analyze a wide range of application scenarios in modern
communication networks [9], [10].



The association problem is related in nature to the channel selection problem. We note that
when a single technology is used, or when the decision concerns the choice of channels of
a given access point rather than the choice of an access point, one can often exploit the simpler
structure of the decision problem and obtain efficient decentralized solutions. Some examples
of work in that direction are [11], [12], [13], [14], [15]. The potential inefficiency of such
approaches in the context of 802.11 networks has been known for a long time. The term
"performance anomaly" has been frequently used for this inefficiency [8]; it describes the
fact that when some devices use a lower bit rate than the others, the performance of all
devices is considerably degraded. Such a situation is even more problematic when a device
attempts to connect to the Internet; it may not be aware that an access
point with a better channel quality may offer a lower throughput because more terminals
are connected to it. In fact, this is likely to degrade the throughput of all devices transmitting
at the higher rate below the level of the lower rate. This makes the information
given to the user misleading since the throughput of the user cannot be directly inferred
from the quality of his channel. To overcome this hurdle, we introduce a Bayesian game
theoretic framework with partial CSI to maximize the throughput while taking into account
the system overload. This study requires particular attention when all users wish to maximize
their individual throughput but each has a different approach (e.g., users may have different
tolerance for delay, or may have a certain QoS to guarantee).



The structure of the paper is as follows. The system model is described in
Sec. II. Next, in Sec. III, we provide a thorough analysis of the Bayes equilibria for both
non-cooperative and Stackelberg frameworks in the two-user case: Sec. III-A reviews the main
results of [6] for the non-cooperative game and Sec. III-B presents the Stackelberg Bayesian
game framework adopted for the considered association problem. We first show how the
base station can control the equilibrium of its users by means of a Stackelberg formulation
and then we derive analytically the utilities of the users and compute equilibria. Sec. III-C
presents three different evaluation scenarios along with some key performance indicators
including the price of anarchy. In Sec. IV, we generalize the results of previous sections to a
situation where there are more than two symmetric users. Sec. V concludes the paper.

II. System Model 

Consider a network composed of WiFi and 3G LTE. Each user entering the system will
decide individually which of the available systems it is best to connect to according to its
radio condition, its demand and the statistical information about other users. Their policies
(or strategies) are then based on this (incomplete) information. The association problem is 
then generalized to allow the BS to control the users' behavior by broadcasting appropriate 
information, expected to maximize its utility while individual users maximize their own 
utility. 

We assume that the user state is defined by the pair $(h_i, b_i)$, where $h_i$ is the downlink
channel gain between the WiFi access point (AP) and the terminal and $b_i$ is the demand of user $i$.
The action $a_i$ is defined by the user's decision to connect to a certain radio access technology
(RAT). The network is fully characterized by the user states. However, when distributing
the joint radio resource management (JRRM) decisions, this complete information is not
available to the users. The BS or the AP broadcasts to its terminals aggregated information
indicating a measurement of the communication quality of the wireless channel (excellent,
fair, poor...). This can be done through the Channel Quality Indicator (CQI), which can be a
value (or values) representing a measure of channel quality for a given channel (see Figure
1). Typically, a high CQI value is indicative of a channel with high quality and vice versa.
More formally, assume that the knowledge of each user about his own state is limited to the
pair $(s_i, b_i)$, where $s_i = \mathbb{1}_{\{h_i > \Psi_i\}}$, with $\Psi_i$ a fixed threshold and $\mathbb{1}_C$ the indicator function
equal to 1 if condition $C$ is satisfied and to 0 otherwise. We will call $\Psi_i$ the "CQI threshold"
of user $i$. Thus, a user only knows whether he wants to transmit and whether the channel is
in a good ($s_i = 1$) or in a bad ($s_i = 0$) condition given the CQI threshold. In addition, any
player has information about the probability distribution of his own state $(s_i, b_i)$ and that
of his opponent $(s_j, b_j)$. These are given by $\alpha_i$, the probability that $\{h_i > \Psi_i\}$, and $\beta_i$,
the probability that $b_i = 1$.
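The information structure above can be illustrated with a minimal Python sketch. It assumes exponentially distributed channel gains (the assumption adopted in Section III), and the function names and numerical values are illustrative only, not part of the model.

```python
import math
import random

def good_channel_probability(lam: float, psi: float) -> float:
    """alpha = P(h > psi) for a channel gain h ~ Exp(lam)."""
    return math.exp(-lam * psi)

def sample_user_state(lam: float, psi: float, beta: float, rng: random.Random):
    """Draw the private state (s, b) of one user."""
    h = rng.expovariate(lam)              # channel gain, never revealed exactly
    s = 1 if h > psi else 0               # CQI bit: 1 = good channel, 0 = bad channel
    b = 1 if rng.random() < beta else 0   # demand bit: 1 = the user wants to transmit
    return s, b

if __name__ == "__main__":
    rng = random.Random(0)
    lam, psi, beta = 1.0, 1.0, 0.7        # illustrative values only
    draws = [sample_user_state(lam, psi, beta, rng) for _ in range(20000)]
    print("empirical alpha:", sum(s for s, _ in draws) / len(draws))
    print("exact alpha:    ", good_channel_probability(lam, psi))
```

Each user observes only its own pair (s, b); the opponent is summarized by the two probabilities alpha and beta.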

In the next sections, we provide a thorough analysis of the existence and characterization 
of the Bayes equilibria for both non-cooperative and Stackelberg scenarios. We first focus 
on the two-user case in order to gain insight into how to design decision problems in radio
environments. Then, we generalize our approach to the multi-user case.
III. The two-user case

The first step before analyzing the Stackelberg Bayesian decision scheme is to define the 
utilities of users. These are often related to throughput, whose variations are mainly due to 
network load, radio network conditions and mobility such as handovers. We assume that the 
3G LTE network throughput is constant equal to v and that there is no interference between 
3G LTE and the WiFi network. Consider the following throughput for each system: 

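$$ Thp_i^W = \log\left(1 + \frac{\rho h_i a_i b_i}{\sigma^2 + \rho h_j a_j b_j}\right), \quad j \neq i \tag{1} $$

$$ Thp^C = v \tag{2} $$

where index $W$ stands for the WiFi network and $C$ stands for the 3G LTE cellular network. The
additive noise variance is $\sigma^2$ and $\rho$ is the transmit power, considered constant. We also assume
that the distributions of $h_i$ are of exponential type [16]. Given $\alpha_i$ and $\Psi_i$ we can compute
that the distribution of $h_i$ is $Exp(\lambda_i)$ with

$$ \lambda_i = -\frac{\ln \alpha_i}{\Psi_i}, \quad \text{that is,} \quad \alpha_i = e^{-\lambda_i \Psi_i}. \tag{3} $$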

Given the information that a player has, there are four possible policies of a player $i$ with
$b_i = 1$ (we do not consider state $b_i = 0$, when there is no transmission of any type): WW, WC,
CW and CC, where the first letter is the action taken (W for WiFi, C for 3G LTE) when the
channel is bad ($s_i = 0$) and the second letter is the action taken when it is good ($s_i = 1$).
Let us not consider the policy WC, which is irrational, as the throughput of a player
using WiFi when $\{h_i > \Psi_i\}$ is certainly higher than that when $\{h_i < \Psi_i\}$. We then have a
game with partial CSI with two states and a $(3 \times 3)$ matrix in every state.

Let us denote by $P = [P_1, P_2] \in \mathcal{P}$ the $(2 \times 2)$ policy profile matrix defined as the actions
taken by the two mobiles in low and high channel states. User $i$'s utility in state $s = 0, 1$ is
then given by

$$ u_i(s, P) = \begin{cases} v, & \text{if user } i \text{ chooses } C \text{ at state } s \\ C^i_{P_j}(s), & \text{if user } i \text{ chooses } W \text{ at state } s \end{cases} \tag{4} $$

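The functions $C_k^i$, describing the utility of player $i$ using WiFi when his opponent applies
policy $k$, are defined as follows:

$$ C_k^i(1) = E[c_k^i(h_i) \mid h_i > \Psi_i] = \frac{1}{\alpha_i} \int_{\Psi_i}^{\infty} c_k^i(h_i) \lambda_i e^{-\lambda_i h_i} \, dh_i, \tag{5} $$

$$ C_k^i(0) = E[c_k^i(h_i) \mid h_i < \Psi_i] = \frac{1}{1 - \alpha_i} \int_{0}^{\Psi_i} c_k^i(h_i) \lambda_i e^{-\lambda_i h_i} \, dh_i, \tag{6} $$

with $k = WW, CW, CC$.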

Here $c_k^i(h_i)$ is the utility of player $i$ using W when his channel gain is $h_i$ against policy $k$
of player $j$. These utilities are defined as follows:

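$$ c^i_{WW}(h_i) = \beta_j \int_0^{\infty} \log\left(1 + \frac{\rho h_i}{\sigma^2 + \rho h_j}\right) \lambda_j e^{-\lambda_j h_j} \, dh_j + (1 - \beta_j) \int_0^{\infty} \log\left(1 + \frac{\rho h_i}{\sigma^2}\right) \lambda_j e^{-\lambda_j h_j} \, dh_j \tag{7} $$

Next:

$$ c^i_{CW}(h_i) = \beta_j \int_{\Psi_j}^{\infty} \log\left(1 + \frac{\rho h_i}{\sigma^2 + \rho h_j}\right) \lambda_j e^{-\lambda_j h_j} \, dh_j + (1 - \beta_j) \int_{\Psi_j}^{\infty} \log\left(1 + \frac{\rho h_i}{\sigma^2}\right) \lambda_j e^{-\lambda_j h_j} \, dh_j + \int_0^{\Psi_j} \log\left(1 + \frac{\rho h_i}{\sigma^2}\right) \lambda_j e^{-\lambda_j h_j} \, dh_j \tag{8} $$

Finally:

$$ c^i_{CC}(h_i) = \log\left(1 + \frac{\rho h_i}{\sigma^2}\right) \tag{9} $$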
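The utilities of this section can be evaluated numerically. The following Python sketch is one possible way to do so, under stated assumptions: the opponent occupies the WiFi channel only when it has demand (probability $\beta_j$) and its policy selects W in its current state, the exponential tail is truncated at 40 mean values, and a plain midpoint rule replaces exact integration. The names `integrate`, `c_wifi` and `C_wifi` and all default parameter values are hypothetical.

```python
import math

def integrate(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    step = (b - a) / n
    return step * sum(f(a + (m + 0.5) * step) for m in range(n))

def c_wifi(h_i, policy_j, psi_j, lam_j, beta_j, rho=1.0, sigma2=1.0, n=1000):
    """Expected WiFi rate of user i at gain h_i against opponent policy WW, CW or CC."""
    clean = math.log(1.0 + rho * h_i / sigma2)        # rate when the opponent is not on WiFi
    if policy_j == "CC":
        return clean
    lower = psi_j if policy_j == "CW" else 0.0        # opponent picks W iff h_j > lower
    upper = lower + 40.0 / lam_j                      # truncation of the exponential tail
    shared = integrate(
        lambda h_j: math.log(1.0 + rho * h_i / (sigma2 + rho * h_j))
        * lam_j * math.exp(-lam_j * h_j),
        lower, upper, n)
    p_on_wifi = beta_j * math.exp(-lam_j * lower)     # P(opponent transmits on WiFi)
    return beta_j * shared + (1.0 - p_on_wifi) * clean

def C_wifi(s, policy_j, psi_i, lam_i, psi_j, lam_j, beta_j, n=200):
    """State-conditional utility: s = 1 is the good state, s = 0 the bad one (needs psi_i > 0)."""
    alpha_i = math.exp(-lam_i * psi_i)
    if s == 1:
        lower, upper, mass = psi_i, psi_i + 40.0 / lam_i, alpha_i
    else:
        lower, upper, mass = 0.0, psi_i, 1.0 - alpha_i
    total = integrate(
        lambda h_i: c_wifi(h_i, policy_j, psi_j, lam_j, beta_j)
        * lam_i * math.exp(-lam_i * h_i),
        lower, upper, n)
    return total / mass
```

With these helpers, comparing `C_wifi(s, k, ...)` with the cellular throughput $v$ for each opponent policy `k` gives the entries of the payoff matrices in each state.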



A. Review of the non-cooperative equilibrium

Game theory has accentuated the importance of randomized (mixed) strategies.
However, such strategies do not find a significant role in most communication modems and
source coding codecs, and equilibria where each user randomly picks a decision at each time
epoch are not interesting in our setting, as they amount to perpetual handover
between networks. In what follows, we will make use of the users' utilities obtained above
to derive the pure association strategies.

Definition 1 (Bayes-Nash equilibrium). A strategy profile $P_i^{BNE}$, $\forall i = 1, 2$, corresponds
to a Bayes-Nash equilibrium (BNE) if, for all users, any unilateral switching to a different
strategy cannot improve the user's payoff at any state. Mathematically, this can be expressed by
the following inequality, given the statistical information about the other user: $\forall Q_i \neq P_i^{BNE}$,

$$ u_i(s_i, (P_i^{BNE}, P_j^{BNE})) \geq u_i(s_i, (Q_i, P_j^{BNE})) \tag{10} $$

for $s_i = 0, 1$.

Proposition 1. The game considered in the paper always has a pure-strategy Bayes-Nash
equilibrium. Moreover

(a) (WW, WW) is an equilibrium iff $C^i_{WW}(0) \geq v$ for $i = 1, 2$.

(b) (WW, CW) is an equilibrium iff $C^1_{CW}(0) \geq v$ and $C^2_{WW}(0) \leq v \leq C^2_{WW}(1)$.

(c) (WW, CC) is an equilibrium iff $C^1_{CC}(0) \geq v$ and $C^2_{WW}(1) \leq v$.

(d) (CW, WW) is an equilibrium iff $C^1_{WW}(0) \leq v \leq C^1_{WW}(1)$ and $C^2_{CW}(0) \geq v$.

(e) (CW, CW) is an equilibrium iff $C^i_{CW}(0) \leq v \leq C^i_{CW}(1)$ for $i = 1, 2$.

(f) (CW, CC) is an equilibrium iff $C^1_{CC}(0) \leq v \leq C^1_{CC}(1)$ and $C^2_{CW}(1) \leq v$.

(g) (CC, WW) is an equilibrium iff $C^1_{WW}(1) \leq v$ and $C^2_{CC}(0) \geq v$.

(h) (CC, CW) is an equilibrium iff $C^1_{CW}(1) \leq v$ and $C^2_{CC}(0) \leq v \leq C^2_{CC}(1)$.

(i) (CC, CC) is an equilibrium iff $C^i_{CC}(1) \leq v$ for $i = 1, 2$.
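The conditions of Proposition 1 can be checked mechanically once the values $C_k^i(s)$ are known. The sketch below is an illustration rather than the authors' procedure: it enumerates the nine pure profiles and keeps those where no user gains by switching between C and W in any state. The data layout `C[i][k] = (C_k^i(0), C_k^i(1))` and the numbers in the usage note are assumptions made for the example.

```python
from itertools import product

POLICIES = ("WW", "CW", "CC")   # first letter: action in the bad state, second: in the good state

def utility(own, s, opp, C_i, v):
    """Utility of one user: v on the cellular network, the WiFi utility against policy opp otherwise."""
    return C_i[opp][s] if own[s] == "W" else v

def pure_bne(C, v):
    """Return all pure Bayes-Nash equilibria of the two-user game.

    C[i][k] is the pair (value in state 0, value in state 1) for user i in {0, 1}
    facing opponent policy k.
    """
    equilibria = []
    for profile in product(POLICIES, repeat=2):
        stable = True
        for i in (0, 1):
            own, opp = profile[i], profile[1 - i]
            for s in (0, 1):
                best = max(v, C[i][opp][s])          # best reply is chosen state by state
                if utility(own, s, opp, C[i], v) < best:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria
```

For instance, with the illustrative symmetric values `Ci = {"WW": (0.5, 1.5), "CW": (0.8, 1.8), "CC": (1.0, 2.0)}`, the call `pure_bne([Ci, Ci], 1.2)` returns `[("CW", "CW")]`, which matches condition (e).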

Proof: The statements (a)-(i) are direct consequences of the definition of Bayes-Nash
equilibrium and the form of payoff matrices. Next, it is immediate to see that the definitions
of $C_k^i(s)$ imply the following inequalities:

$$ C^i_{WW}(s) \leq C^i_{CW}(s) \leq C^i_{CC}(s) $$

for $i = 1, 2$ and $s = 0, 1$. Now, using these inequalities, it is tedious but straightforward to
show that at least one of the conditions (a)-(i) is always satisfied.

■

The next proposition gives us some information on how the Bayes-Nash equilibria depend
on the chosen values of the CQI thresholds $\Psi_i$.

Proposition 2. If $\Psi_1$ and $\Psi_2$ are small enough, none of the players uses WW in equilibrium.
If they are large enough, none of the players uses policy CC in equilibrium. Moreover, for
all the values of the parameters of the model one of the two possibilities is true:

(a) For $\Psi_1$ and $\Psi_2$ small enough at least one of the players uses policy CC in equilibrium,

(b) For $\Psi_1$ and $\Psi_2$ large enough at least one of the players uses policy WW in equilibrium.

Proof: For $i = 1, 2$ and $k = CC, CW, WW$, let $C_k^i(\infty)$ be the limit value given by definition (11).
Note that when $\Psi_1 \to 0$ and $\Psi_2 \to 0$, $C_k^i(0)(\Psi_1, \Psi_2) \to 0$ and $C_k^i(1)(\Psi_1, \Psi_2) \to C_k^i(\infty)$ for
$i = 1, 2$, $k = CC, CW, WW$. Analogously, when $\Psi_1 \to \infty$ and $\Psi_2 \to \infty$, $C_k^i(0)(\Psi_1, \Psi_2) \to
C_k^i(\infty)$ and $C_k^i(1)(\Psi_1, \Psi_2) \to +\infty$. Thus for $\Psi_1$ and $\Psi_2$ small enough, $C_k^i(0)(\Psi_1, \Psi_2) < v$
for all the values of $i$ and $k$, which by Proposition 1 implies that no player uses policy WW
in equilibrium. Analogously, for $\Psi_1$, $\Psi_2$ big enough, $C_k^i(1)(\Psi_1, \Psi_2) > v$ for all the values of
$i$ and $k$, and thus no player uses CC in equilibrium then.

Now note that by Proposition 1 one of the players uses WW in equilibrium iff

$$ C^1_{CW}(0) \geq v \quad \text{or} \quad C^2_{CW}(0) \geq v. $$

Thus, if we take $\Psi_1$, $\Psi_2$ large enough, we can pass to the limit:

$$ C^1_{CW}(\infty) \geq v \quad \text{or} \quad C^2_{CW}(\infty) \geq v. \tag{12} $$

Analogously, one of the players uses CC in equilibrium iff

$$ C^1_{CW}(1) \leq v \quad \text{or} \quad C^2_{CW}(1) \leq v. $$

Passing to the limit when $\Psi_1$, $\Psi_2$ approach 0,

$$ C^1_{CW}(\infty) \leq v \quad \text{or} \quad C^2_{CW}(\infty) \leq v. \tag{13} $$

However, (12) and (13) cover all the values of $v$, ending the proof.

■

Roughly speaking, this proposition means that for higher values of the CQI thresholds $\Psi_i$
the players are more likely to use WiFi rather than 3G LTE and conversely, for low values
of the CQI thresholds the players are more likely to use 3G LTE rather than WiFi.
Interestingly, Proposition 2 also suggests that, rather than increasing the offered throughput
$v$, the operator could control the equilibrium of its wireless users to maximize its own revenue
by broadcasting appropriate CQI thresholds. This can lead the network to minimize its overall
cost and users to a misleading association problem.

Next, we address a hierarchical approach for choosing the CQI thresholds. 



B. The hierarchical equilibrium 

In this section, we propose a methodology that transforms the above non-cooperative game 
into a Stackelberg game. Concretely, the network may guide users to an equilibrium that 
optimizes its own utility if it chooses the adequate information to send. We first study the 
policy that maximizes the utility of the network, which is defined as the probability that both 
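users use the 3G LTE network rather than WiFi:

$$ U_{BS}(P, \Psi_1, \Psi_2) = (1 - \alpha_1)(1 - \alpha_2) \mathbb{1}_{\{P_{11} = P_{21} = C\}} + \alpha_1 \alpha_2 \mathbb{1}_{\{P_{12} = P_{22} = C\}} + (1 - \alpha_1) \alpha_2 \mathbb{1}_{\{P_{11} = P_{22} = C\}} + \alpha_1 (1 - \alpha_2) \mathbb{1}_{\{P_{12} = P_{21} = C\}} $$

where $P_{i1}$ and $P_{i2}$ are the actions of user $i$ in the low and high channel state, respectively,
and $\alpha_1$, $\alpha_2$ depend on $\Psi_1$, $\Psi_2$ as in (3).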
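As an illustration, the network utility can be computed by summing the probabilities of the joint channel states in which both users select the cellular network. The Python sketch below assumes exponential gains, so that $\alpha_i = e^{-\lambda_i \Psi_i}$, and encodes a policy as a two-letter string (bad-state action first); the function name `bs_utility` is hypothetical.

```python
import math

def bs_utility(p1, p2, psi1, psi2, lam1, lam2):
    """Probability that both users pick the cellular network C under policies p1 and p2."""
    alpha = (math.exp(-lam1 * psi1), math.exp(-lam2 * psi2))   # P(good channel) per user
    total = 0.0
    for s1 in (0, 1):
        for s2 in (0, 1):
            prob = ((alpha[0] if s1 else 1.0 - alpha[0])
                    * (alpha[1] if s2 else 1.0 - alpha[1]))
            if p1[s1] == "C" and p2[s2] == "C":                # both on 3G LTE in this state pair
                total += prob
    return total
```

For example, when both users play CW the utility reduces to $(1 - \alpha_1)(1 - \alpha_2)$, and when both play CC it equals 1 whatever the thresholds.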

Nevertheless, as it is not realistic to consider that the users will seek the global optimum,
we show how to find the policy that corresponds to the Bayes-Stackelberg equilibrium where
the BS tries to maximize the probability $U_{BS}$ that both players use the 3G LTE network just by
choosing the CQI thresholds, knowing that users will try to maximize their individual utility.

Definition 2 (Bayes-Stackelberg equilibrium). By denoting $(\Psi_1^{BSE}, \Psi_2^{BSE})$ the strategy
profile of the BS at the Bayes-Stackelberg equilibrium (BSE), this definition translates
mathematically as

$$ (\Psi_1^{BSE}, \Psi_2^{BSE}) = \arg\max_{\Psi_1, \Psi_2} U_{BS}(P^{BNE}(\Psi_1, \Psi_2), \Psi_1, \Psi_2), \tag{14} $$

where $P^{BNE}(\Psi_1, \Psi_2)$ denotes the Bayes-Nash equilibrium in the game of the previous section
with CQI thresholds equal to $\Psi_1$, $\Psi_2$.
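The two-stage structure of this definition can be sketched as a leader-follower search. The Python code below is a minimal illustration that replaces the continuous maximization by a search over a finite grid of thresholds; the callbacks `follower_equilibrium` and `leader_utility` are hypothetical placeholders for a Bayes-Nash solver and for $U_{BS}$, and the toy callbacks in the usage part are invented for the example only.

```python
import math
from itertools import product

def stackelberg_thresholds(grid, follower_equilibrium, leader_utility):
    """Grid search for the leader: anticipate the followers' equilibrium at each threshold pair."""
    best = None
    for psi1, psi2 in product(grid, repeat=2):
        profile = follower_equilibrium(psi1, psi2)    # equilibrium the users would play, or None
        if profile is None:
            continue
        value = leader_utility(profile, psi1, psi2)
        if best is None or value > best[0]:
            best = (value, psi1, psi2, profile)
    return best

if __name__ == "__main__":
    # Toy follower and leader models, for illustration only.
    def toy_follower(psi1, psi2):
        return ("CW", "CW") if psi1 + psi2 <= 2.0 else ("WW", "WW")

    def toy_leader(profile, psi1, psi2):
        if profile == ("CW", "CW"):
            return (1.0 - math.exp(-psi1)) * (1.0 - math.exp(-psi2))
        return 0.0

    print(stackelberg_thresholds([0.0, 0.5, 1.0, 1.5], toy_follower, toy_leader))
```

The grid resolution controls the accuracy of the leader's choice; when several equilibria exist for the same thresholds, the follower callback has to select one of them.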

C. Performance Evaluation 

We next exemplify our general analysis by considering three
scenarios for the choice of $\Psi_1$ and $\Psi_2$:

1) Fully cooperative model - the base station chooses both the $\Psi_i$s and the policies for the
players, aiming to maximize the probability that both players use 3G LTE in the second
stage. Formally, the fully cooperative strategy is the one satisfying

$$ (\Psi_1^C, \Psi_2^C, P^C) = \arg\max_{\Psi_1, \Psi_2, P} U_{BS}(P, \Psi_1, \Psi_2), $$

2) Stackelberg model - there are two stages: at the first one the base station chooses both
$\Psi_i$s given the information about the distributions of $(h_i, b_i)$, aiming to maximize the
probability that both players use 3G LTE at the second stage, when players play the
game from the last section. The proposed approach can be seen as an intermediate scheme
between the fully cooperative model and the fully non-cooperative model,

3) Fully non-cooperative model - the game has two stages: at the first one, players choose
their $\Psi_i$s given the information they have about the distributions of $(h_i, b_i)$, aiming to
maximize their expected throughput at the second stage; at the second stage they choose
a policy depending on the actual $(s_i, b_i)$ as in the model of the last section. Formally, the
fully non-cooperative strategy is the one satisfying

$$ \Psi_i^{NC} = \arg\max_{\Psi_i} E[u_i(s_i, P^{BNE}(\Psi_i, \Psi_j^{NC}))], \quad \text{for } i = 1, 2. $$

Below, we analyze the behavior of the base station and the players at the equilibria of
each of these models.

Proposition 3.

1) In the fully cooperative model, the base station chooses $\Psi_1 = \Psi_2 = 0$ and CC
policies for both users.

2) In the Stackelberg model, when $C^i_{CC}(\infty) < v$ for $i = 1, 2$ $^1$, then the base station chooses
any $\Psi_1 < \Psi_1^*$ and $\Psi_2 < \Psi_2^*$, with $\Psi_i^*$ satisfying $^2$ $C^1_{CC}(1)(\Psi_1^*) = v$ and $C^2_{CC}(1)(\Psi_2^*) = v$,
and then users both play CC.

When $C^i_{CC}(\infty) \geq v$ for $i = 1$ or $i = 2$, then the base station chooses $\Psi_i^{**}$ for $i = 1, 2$
maximizing $^3$ either

$$ (1 - e^{-\lambda_1 \Psi_1})(1 - e^{-\lambda_2 \Psi_2}) $$

subject to $C^i_{CW}(0)(\Psi_1, \Psi_2) \leq v$, $i = 1, 2$, or maximizing

$$ 1 - e^{-\lambda_i \Psi_i} $$

subject to $C^i_{CC}(0)(\Psi_1, \Psi_2) \leq v$ and $C^j_{CW}(1)(\Psi_1, \Psi_2) \leq v$. In the first case, both players
choose CW in the second stage. In the second case, user $i$ chooses CW and user $j$
chooses CC.

3) In the fully non-cooperative model, the players in equilibrium choose $\Psi_1 = \Psi_1^{NC}$ and
$\Psi_2 = \Psi_2^{NC}$ satisfying

$$ c^1_{CW}(\Psi_1^{NC}) = v = c^2_{CW}(\Psi_2^{NC}) \tag{15} $$

and then both use a CW policy.

1 Recall definition (11).
2 Here and in the sequel $C_k^i(s)(\Psi_1, \Psi_2)$ denotes the respective $C_k^i(s)$ when the $\Psi_i$s have the given values.
3 Of course the one of the two with the higher objective function is chosen; its value is the BS utility at equilibrium.

For the clarity of the exposition, proofs are given in the Appendix. What we see in this
proposition is that when the BS can decide on the behavior of the users, it chooses not to
disclose any additional information to them by giving $\Psi_1 = \Psi_2 = 0$ and forces them to
use 3G LTE. In other cases (when users can decide on their behavior, but are given only
partial information), the users' interest is to choose the CQI thresholds somewhere in the 
middle of the channel gain range. This can be seen as a desired trade-off between the global 
network performance at the equilibrium and the individual efficiency of all the users. On 
the other hand, the BS has an incentive to choose CQI thresholds either very low (first case 
in the Stackelberg scenario) or very high (the second case). Both these choices give little 
information for the user about actual channel condition, which is precisely what he wants 
to avoid. It is interesting and somewhat surprising that the optimal policy of the BS in the 
Stackelberg game can be both giving high or low values of CQI thresholds. This can however 
be explained when we understand the meaning of these two situations - very low value of 
the threshold means that no information about the channel state is given. In this case, when 
both users connect to 3G LTE, this corresponds to the choice of the BS. Now, if in the "no
information" case players choose WiFi, then the base station tries to divide the range of $h_i$
into a small (in terms of probability) part where the players use WiFi and a large one where
they use 3G LTE. This is done by giving the highest possible CQI threshold below which
the players would have an incentive to use 3G LTE rather than WiFi. This explains why the
BS has an incentive to choose very high CQI thresholds in this case.

The final two results of this section are given without proofs, which are straightforward. 

Corollary 1. Note that the maximum network utility, obtained in scenario 1), is equal to 1.
Obviously the utilities obtained in the other two scenarios always satisfy

$$ 1 \geq U_{BS}(P^{BSE}, \Psi_1^{BSE}, \Psi_2^{BSE}) \geq U_{BS}(P^{NC}, \Psi_1^{NC}, \Psi_2^{NC}) > 0. \tag{16} $$

The important fact the corollary implies is that the users, if choosing thresholds optimal
from their point of view, never choose the same way the base station would, but their interests
are not contradictory. Note that by Proposition 2 there exists a choice of thresholds $\Psi_1$ and
$\Psi_2$ which gives a zero utility for the base station. The users never have an incentive to make
such a choice.



In the second corollary, we give the method to compute the price of anarchy (PoA) for our
model. The PoA measures how good the system performance is when users play selfishly
and reach the NE instead of playing to achieve the social optimum [17], [18]. Note that as
the maximum network utility which can be obtained is 1, the price of anarchy when players
use strategy profile $P$ is

$$ PoA = \frac{1}{U_{BS}(P, \Psi_1, \Psi_2)}. $$

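Thus, Proposition 3 implies the following.

Corollary 2. The price of anarchy in the Stackelberg model equals 1 whenever $C^i_{CC}(\infty) < v$
for $i = 1, 2$. When for some $i$, $C^i_{CC}(\infty) \geq v$, then the price of anarchy is equal to the smaller
of the two values:

$$ \min_{C^k_{CW}(0)(\Psi_1, \Psi_2) \leq v, \; k = 1, 2} \frac{1}{(1 - e^{-\lambda_1 \Psi_1})(1 - e^{-\lambda_2 \Psi_2})}, $$

$$ \min_{C^i_{CC}(0)(\Psi_1, \Psi_2) \leq v, \; C^j_{CW}(1)(\Psi_1, \Psi_2) \leq v} \frac{1}{1 - e^{-\lambda_i \Psi_i}}. $$

In the fully non-cooperative model,

$$ PoA = \frac{1}{(1 - e^{-\lambda_1 \Psi_1^{NC}})(1 - e^{-\lambda_2 \Psi_2^{NC}})}, $$

where $\Psi_1^{NC}$, $\Psi_2^{NC}$ satisfy (15).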



The above corollary is just a rewriting of Proposition 3 using different language. It
is however important to see that the algorithms for finding equilibria in all the hierarchical
models considered give us not only the equilibrium strategies, but also the tools to evaluate
the performance of the network in each of these situations.
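As a small numerical companion to the price of anarchy, the Python sketch below evaluates it as the inverse of the network utility, and specializes it to the case where both users play CW, in which the network utility is the probability that both channels are bad. The function names are hypothetical and the exponential channel model of Section III is assumed.

```python
import math

def price_of_anarchy(network_utility: float) -> float:
    """PoA as the inverse of the network utility, since the optimal utility equals 1."""
    if network_utility <= 0.0:
        return math.inf               # the network utility is zero: unbounded PoA
    return 1.0 / network_utility

def poa_both_cw(psi1: float, psi2: float, lam1: float, lam2: float) -> float:
    """PoA when both users play CW: both use 3G LTE only if both channels are bad."""
    utility = (1.0 - math.exp(-lam1 * psi1)) * (1.0 - math.exp(-lam2 * psi2))
    return price_of_anarchy(utility)
```

For instance, thresholds for which each channel is good with probability one half give a network utility of one quarter and hence a price of anarchy of 4.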

IV. The multi-user case 

Now let us consider the case where instead of two we have $n$ users choosing to connect
either to WiFi or to the 3G LTE network. Again we assume that the information about the channel
quality that user $i$ possesses is limited to that about the distributions of states $(s_j, b_j)$ of each
of the players (including $i$), that is about $\alpha_j$ (or $\lambda_j$) and $\beta_j$, and to exact information about
his own current state $(s_i, b_i)$ (but not about the exact value of $h_i$). Our additional assumption
about the model considered in this section is that the model is symmetric, that is, all the
values $\beta_i$, $\lambda_i$ and $\Psi_i$ defining it are the same for each of the players (and equal to $\beta$, $\lambda$ and
$\Psi$ respectively). This significantly simplifies the notation without any serious limitation of
generality (we believe that some counterparts of all our results will also be true for the asymmetric
model).
To define the utilities of the players, let us first redefine the throughput for each system:

$$ Thp_i^W = \log\left(1 + \frac{\rho h_i a_i b_i}{\sigma^2 + \rho \sum_{j \neq i} h_j a_j b_j}\right) \tag{17} $$

$$ Thp^C = v \tag{18} $$

Again we assume that each of the players uses one of the three policies WW, CW, CC,
where the first letter stands for a player's action when his channel is bad, and the second one
when his channel is good. As it is troublesome to write down the policies for each of the $n$
players, we will make use of the fact that the game is symmetric, writing instead of the
policy profile a policy statistics $K = (k_{WW}, k_{CW})$, with $k_{WW}$ denoting the number of players
applying policy WW and $k_{CW}$ the number of players applying CW. Of course the number of those
using policy CC is $n - k_{WW} - k_{CW}$, so we will omit it. Given $K$, we can define user $i$'s
utility in state $s = 0, 1$ as $^4$

$$ u_i(s, K) = \begin{cases} v, & \text{if user } i \text{ chooses } C \text{ at state } s, \\ C^i_{K_{-i}}(s), & \text{if user } i \text{ chooses } W \text{ at state } s \end{cases} \tag{19} $$

where the functions $C^i_{K_{-i}}$, describing the utility of player $i$ using WiFi when his opponents
use policies described by $K_{-i}$, are defined similarly to the two-user case:

$$ C^i_{K_{-i}}(1) = E[c^i_{K_{-i}}(h_i) \mid h_i > \Psi] \tag{20} $$

$$ C^i_{K_{-i}}(0) = E[c^i_{K_{-i}}(h_i) \mid h_i < \Psi] = \frac{1}{1 - \alpha} \int_0^{\Psi} c^i_{K_{-i}}(h_i) \lambda e^{-\lambda h_i} \, dh_i. \tag{21} $$

Jo 



4 Notation $K_{-i}$ denotes the policy statistics defined as in the two-user case but without the policy of user $i$. 

Next, the functions $c^i_{K_{-i}}$, defining the utility of player $i$ using W when his channel gain is $h_i$ against 
policies $K_{-i}$ of his opponents, can be written as$^5$: 



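$$c^i_{[k_1,k_2]}(h_i) = \sum_{r=0}^{k_1} \sum_{q=0}^{k_2} \sum_{v=0}^{q} \binom{k_1}{r} \binom{k_2}{q} \binom{q}{v} \beta^{r+q} (1-\beta)^{k_1+k_2-r-q} \int_{[\Psi,\infty)^v \times \mathbb{R}^{n-q-1} \times [0,\Psi)^{q-v}} \lambda^{n-1} \log\left(1 + \frac{\rho h_i}{\sigma^2 + \rho \sum_{j=1}^{r+v} h_j}\right) e^{-\lambda \sum_{j=1}^{n-1} h_j}\, dh_1 \ldots dh_{n-1} \qquad (22)$$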




Proposition 4. The symmetric $n$-user game considered in the paper always has a pure-strategy 
Bayes-Nash equilibrium of one of five types: 

(a) When $C^i_{[0,0]}(1) < v$, the profile where all the players use policy CC is an equilibrium. 

(b) When $C^i_{[0,k-1]}(1) > v > C^i_{[0,k]}(1)$, any profile where $k$ players use policy CW and 
all the others play CC is an equilibrium. 

(c) When $C^i_{[0,n-1]}(1) > v$ and $C^i_{[0,n-1]}(0) < v$, the profile where all the players use 
policy CW is an equilibrium. 

(d) When $C^i_{[k,n-k-1]}(1) > v$ and $C^i_{[k-1,n-k]}(0) > v > C^i_{[k,n-k-1]}(0)$, any profile where $k$ 
players apply policy WW and the remaining $n - k$ players use policy CW is an equilibrium. 

(e) When $C^i_{[n-1,0]}(0) > v$, the profile where all the players use policy WW is an 
equilibrium. 

It may also have another pure-strategy Bayes-Nash equilibrium with $k$ players using WW 
and $l < n - k$ using CW when $C^i_{[k,l-1]}(1) > v > C^i_{[k,l]}(1)$ and $C^i_{[k-1,l]}(0) > v > C^i_{[k,l-1]}(0)$. 
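
Whether a given profile is a pure-strategy Bayes-Nash equilibrium can also be tested directly from the definition, by checking for each policy the incentive constraint in each state. The sketch below does this with weak inequalities instead of going through the case distinction of the proposition. It assumes a hypothetical callable `C(a, b, s)` that returns the expected WiFi utility in state `s` of a player whose opponents include `a` players using WW and `b` players using CW; this callable and the function names are not part of the paper.

```python
def is_bayes_nash(C, k_ww, k_cw, k_cc, v):
    """Check the per-state incentive constraints of a symmetric profile.

    k_ww, k_cw, k_cc: numbers of players using policies WW, CW and CC.
    C(a, b, s): hypothetical callable, expected WiFi utility in state s
    against a opponents using WW and b opponents using CW.
    v: throughput obtained on the cellular side.
    """
    if k_ww > 0:
        # A WW player must weakly prefer W in both states.
        if C(k_ww - 1, k_cw, 0) < v or C(k_ww - 1, k_cw, 1) < v:
            return False
    if k_cw > 0:
        # A CW player prefers C when the channel is bad and W when it is good.
        if C(k_ww, k_cw - 1, 0) > v or C(k_ww, k_cw - 1, 1) < v:
            return False
    if k_cc > 0:
        # A CC player must weakly prefer C in both states.
        if C(k_ww, k_cw, 0) > v or C(k_ww, k_cw, 1) > v:
            return False
    return True

def pure_equilibria(C, n, v):
    """Enumerate all (k_ww, k_cw, k_cc) with n players that pass the check."""
    return [(k, l, n - k - l)
            for k in range(n + 1)
            for l in range(n - k + 1)
            if is_bayes_nash(C, k, l, n - k - l, v)]
```

Because utilities are compared state by state, the check also covers deviations to policies outside the three considered ones.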

We give two corollaries of this proposition. The first one considering two-user games 
discussed before is immediate. 

Corollary 3. Whenever the two-user symmetric game has an equilibrium of the form (CC, WW) 
or (WW, CC), it also has another pure equilibrium, where in at least one state both players 
use the same action. 

The second corollary gives a kind of consistency property for equilibria in games for 
different values of n. 

5 Of course, this formula is a generalization of the formulas for $c^i$ given in Section III, and it applies for any $n \ge 2$; in 
particular $c^i_{CC} = c^i_{[0,0]}$, $c^i_{CW} = c^i_{[0,1]}$ and $c^i_{WW} = c^i_{[1,0]}$ when $n = 2$ and the players are symmetric. 

Corollary 4. Suppose that 

(a) a profile where $k$ players apply policy CW and all the other players use policy CC is 
an equilibrium in the $n$-user symmetric game. Then it is also an equilibrium in any $m$-user 
game defined with the same parameters $\beta$, $\lambda$ and $\Psi$ and $m > k$. 

(b) a profile where $k$ players use policy WW, $l$ players use policy CW and $k + l < n$ is 
an equilibrium in the $n$-user symmetric game. Then it is also an equilibrium in any $m$-user 
game defined with the same parameters $\beta$, $\lambda$ and $\Psi$ and $m > k + l$. 

Moreover, for any parameters $\beta$, $\lambda$ and $\Psi$ there exists an $n$ such that for any $m > n$ any 
profile with $n$ players applying policy CW and $m - n$ using policy CC is an equilibrium in 
the $m$-user game. 

Proof: Note that $C^i_{[k_1,k_2]}(s)$ does not depend on the number of players in the game $n$, 
only on the number of those who use one of the policies CW or WW. This alone implies 
parts (a) and (b). The final part is due to the fact that $C^i_{[0,n-1]}(1) \to 0$ as $n \to \infty$. 

■ 

The next proposition generalizes the results for the hierarchical model included in Proposition 
3 to $n$-user symmetric games. We only consider scenarios 1) and 2) discussed there, as it 
is difficult to apply scenario 3) to the symmetric model. 

Proposition 5. 

1) In the fully cooperative model, the base station chooses $\Psi = 0$ and CC policies for all 
the users. 

2) In the Stackelberg model the base station computes, for every $k \le n$ such that$^6$ 

$$C^i_{[k-1,0]}(\infty) := \int_0^{\infty} c^i_{[k-1,0]}(h)\, \lambda e^{-\lambda h}\, dh > v,$$

the value $\Psi(k)$ such that 

$$C^i_{[0,k-1]}(0)(\Psi(k)) = v.$$

Then for each such $k$ it computes 

$$P(k) = \left(1 - e^{-\lambda \Psi(k)}\right)^k$$

and chooses $k^*$ with the biggest value of $P(k)$ (which equals the BS utility at equilibrium). 
The choice of $\Psi(k^*)$ at the first stage and any profile of policies where $k^*$ players 
use policy CW and all the remaining ones play CC will then be an equilibrium. 

6 There will always be only a finite number of $k$ satisfying this inequality, as $\int_0^{\infty} c^i_{[k-1,0]}(h)\, \lambda e^{-\lambda h}\, dh \to 0$ for $k \to \infty$. 

If even $C^i_{[0,0]}(\infty) < v$, then at the equilibrium the base station chooses any $\Psi$ such that 
$C^i_{[0,0]}(1)(\Psi) < v$ and all the players use policy CC. 
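
The procedure of part 2) can be sketched in a few lines. The code below is only an illustration under stated assumptions: `C0(k, psi)` is a hypothetical callable returning the bad-state WiFi utility of a CW player facing $k - 1$ other CW players when the threshold is `psi` (assumed increasing in `psi`, as claimed in Lemma 1 of the Appendix), `C_inf(k)` is a hypothetical callable returning the unconditional expected WiFi utility against $k - 1$ always-interfering opponents, and `psi_max` is an arbitrary upper end of the search interval. The threshold is found by bisection, which is our choice and not necessarily the authors'.

```python
import math

def bisect_root(f, lo, hi, tol=1e-10):
    """Root of an increasing function f on [lo, hi], assuming f(lo) <= 0 <= f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def stackelberg_choice(C0, C_inf, n, v, lam, psi_max=1e3):
    """Return (k_star, psi_star, P_star), or None if no k is admissible.

    For every admissible k the threshold psi(k) solves C0(k, psi) = v,
    and P(k) = (1 - exp(-lam * psi(k))) ** k is the base station utility.
    """
    best = None
    for k in range(1, n + 1):
        if C_inf(k) <= v:
            continue  # k is not admissible
        psi_k = bisect_root(lambda psi: C0(k, psi) - v, 0.0, psi_max)
        p_k = (1.0 - math.exp(-lam * psi_k)) ** k
        if best is None or p_k > best[2]:
            best = (k, psi_k, p_k)
    return best
```

When no $k$ is admissible, the function returns `None`, which corresponds to the last case of the proposition, where all the players use policy CC.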

We give one corollary to this proposition. 

Corollary 5. The price of anarchy in the $n$-user Stackelberg model can be computed as 

$$\mathrm{PoA} = \frac{1}{U_{BS}}$$

(where the base station utility $U_{BS}$ is generalized from the two-user case as the probability 
that all the users choose 3G LTE rather than WiFi), and is either equal to 1, when $C^i_{[0,0]}(\infty) < v$, 
or satisfies 

$$\mathrm{PoA} = \min_{k \le n,\, k \le n^*} \frac{1}{P(k)}$$

with $n^* = \max\{k : C^i_{[k-1,0]}(\infty) > v\}$, when $C^i_{[0,0]}(\infty) > v$. For any given values of $\beta$ and $\lambda$, 
the price of anarchy is a non-increasing function of $n$ and constant for $n > n^*$. 
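
The monotonicity in $n$ is easy to see numerically once the values $P(k)$ are available. The sketch below assumes that the price of anarchy is the reciprocal of the best base station utility, that is $\min_k 1/P(k)$ over the admissible $k$, and that it equals 1 when no $k$ is admissible. The dictionary `P` of precomputed values and the function name are hypothetical.

```python
def price_of_anarchy(P, n, n_star):
    """PoA of the n-user game from precomputed values P[k] (hypothetical input).

    Only k <= min(n, n_star) are admissible; with no admissible k the
    base station obtains utility 1 and the PoA equals 1.
    """
    admissible = [k for k in P if k <= min(n, n_star)]
    if not admissible:
        return 1.0
    return min(1.0 / P[k] for k in admissible)
```

Increasing $n$ can only enlarge the set of admissible $k$, so the minimum cannot increase, and it stops changing once $n$ exceeds $n^*$.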

The first part of this corollary is again just a rewriting of the results from Proposition 5, 
with the stress put on network utilities rather than on the strategies of the players. It shows that 
exactly the same procedure used to find the equilibrium policies can be applied to evaluate 
the performance of the network. 

The second part of the corollary (the properties of the PoA) is a consequence of the fact 
that adding each new player to the game gives the BS more patterns of user behavior 
that can be stimulated by a proper choice of $\Psi$. This results in a decrease in the 
PoA up to the threshold number of players $n^*$, from which no new player in the game is 
interested in using the WiFi network, because for such a large number of players it would be 
too slow, regardless of how good the channel is. This result may seem surprising 
at first glance, as usually a bigger number of players means more anarchy. However, if we 
look at the objective function of the BS, which is the probability of all the players using 
the 3G LTE network, we clearly see that a bigger number of players is disadvantageous for the 
WiFi network, which may get congested, and favorable for 3G LTE, which cannot. 

If one is interested not in finding the equilibria for all the numbers of players, but only 
in the limit behavior and the limit value of the PoA, then instead of checking the values of 
$C^i_{[k-1,0]}(\infty)$ for every $k$ to compute $n^*$ given above, one may compute the upper bound given 
below. 



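Corollary 6. The value $n^*$ appearing above can be bounded from above by $n^{**} = \max\{\frac{2}{\beta} + 1, k^*\}$ 
with $k^*$ satisfying 

$$e^{-\frac{(k^*-1)\beta^2}{2}} \int_0^{\infty} \log\left(1 + \frac{\rho h}{\sigma^2}\right) \lambda e^{-\lambda h}\, dh + \log\left(\frac{(k^*-1)\beta}{(k^*-1)\beta - 2}\right) = v.$$

This last bound cannot be given in a closed form, but it can be computed much faster than 
$n^*$. 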
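
A numerical version of this bound can be sketched as follows. The code assumes that the bound takes the form obtained at the end of the proof in Appendix D, namely $e^{-(k-1)\beta^2/2}\, E[\log(1 + \rho h/\sigma^2)] + \log\frac{(k-1)\beta}{(k-1)\beta - 2}$ with $h$ exponentially distributed with parameter $\lambda$. Instead of solving for a real-valued $k^*$, it returns the smallest integer $k > 2/\beta + 1$ at which this expression drops to $v$ or below, which is a variation made for this illustration. All function names and the integration settings are assumptions.

```python
import math

def expected_log_snr(rho, sigma2, lam, steps=200000, upper_factor=50.0):
    """E[log(1 + rho * h / sigma2)] for h ~ Exp(lam), by the midpoint rule."""
    upper = upper_factor / lam  # arbitrary truncation of the exponential tail
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        h = (i + 0.5) * dx
        total += math.log(1.0 + rho * h / sigma2) * lam * math.exp(-lam * h)
    return total * dx

def bound_rhs(k, beta, e_log):
    """Assumed upper bound on the expected WiFi utility against k - 1 opponents."""
    m = (k - 1) * beta  # must exceed 2, i.e. k > 2 / beta + 1
    return math.exp(-(k - 1) * beta ** 2 / 2.0) * e_log + math.log(m / (m - 2.0))

def n_double_star(beta, v, rho, sigma2, lam, k_max=10 ** 6):
    """Smallest integer k > 2 / beta + 1 with bound_rhs(k) <= v, or None."""
    e_log = expected_log_snr(rho, sigma2, lam)
    k_min = math.floor(2.0 / beta + 1.0) + 1
    for k in range(k_min, k_max):
        if bound_rhs(k, beta, e_log) <= v:
            return k
    return None
```

Both terms of the bound decrease in $k$, so a simple upward scan is enough, and no evaluation of the multidimensional integrals behind $n^*$ is needed.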

V. Conclusion 

We have proposed a hierarchical association method that combines the benefits of both 
decentralized and centralized design, in which the network operator optimizes its global utility 
while users maximize their individual utilities. The users' decision making is based on partial 
information that is signaled to the mobiles by the base station. A central design aspect is then 
for the base station to decide how to aggregate information, which in turn determines what to 
signal to the users. In this setting, we have shown that, in order to maximize its revenue, 
the network operator, rather than increasing its offered throughput (which is costly), has an 
incentive to choose channel quality indicator thresholds that are either very low or very high. This 
may make the information given to a user attempting to connect misleading, since 
the throughput of a user cannot be inferred from the quality of his channel alone but also 
depends on the channel quality indicator thresholds the base station broadcasts. In particular, 
there may be different equilibria (and hence different outcomes) depending on what information 
(channel quality indicator thresholds) the base station broadcasts to the users. 

The proposed approach provides a reasonable trade-off between centralized and decentralized 
optimization in terms of the signaling overhead and the resulting network throughput 
performance. 

Appendix 

A. Proof of Proposition 3 
Proof: 

1) is obvious and needs no explanation. 

2) Since $C^i_{CC}(1)(\Psi_1, \Psi_2) \to C^i_{CC}(\infty)$ for $i = 1, 2$ when $\Psi_1 \to 0$ and $\Psi_2 \to 0$, if 
$C^i_{CC}(\infty) < v$, then for $\Psi_i$ small enough (in the worst case, equal to 0) also $C^i_{CC}(1)(\Psi_1, \Psi_2) < v$ 
for $i = 1, 2$. But this means that (CC, CC) is an equilibrium in the game at the second 
stage. Thus whenever $\Psi_1 < \Psi_1^{**}$ and $\Psi_2 < \Psi_2^{**}$, with $\Psi_i^{**}$ satisfying $C^i_{CC}(1)(\Psi_1^{**}, \Psi_2^{**}) = v$, the 
outcome of the Stackelberg game is that both players use 3G LTE with probability 1, which 
is the biggest possible value of the base station's utility. Now suppose that $C^i_{CC}(\infty) > v$. 
Then even for the value of $\Psi_i$ equal to 0, playing (CC, CC) is not an equilibrium in the 
game of the second stage. Thus, to maximize the probability of both players using 3G LTE, 
the base station has to choose $\Psi_i$ and $\Psi_j$ in such a way that the equilibrium in the game 
of the second stage is either (CW, CW) or (CW, CC) and that the probability that the 
state of the players using policy CW is $(h_i < \Psi_i)$ is the highest possible. This is done by 
solving the optimization problems defined in the proposition (the first problem for the case 
(CW, CW), the second one for (CW, CC)). 



3) First note that whenever $\Psi_1^*$ and $\Psi_2^*$ are chosen as in (15), (CW, CW) is an equilibrium. 
This is because $C^i_{CW}(0)$ is a conditional expectation of $c^i_{CW}(h_i)$ over the set $H_- := 
\{c^i_{CW}(h_i) < v\}$, so it is definitely smaller than $v$. Similarly, $C^i_{CW}(1)$ is a conditional 
expectation of $c^i_{CW}(h_i)$ over the set $H_+ := \{c^i_{CW}(h_i) > v\}$, so it is bigger than $v$. Thus 
the condition for (CW, CW) to be an equilibrium is definitely satisfied. 

Now note that whenever player $i$ chooses $\Psi_i < \Psi_i^*$ at the first stage, but continues to use 
policy CW in the second, he loses 

$$\int_{\Psi_i}^{\Psi_i^*} \left(v - c^i_{CW}(h_i)\right) \lambda_i e^{-\lambda_i h_i}\, dh_i > 0.$$

Similarly, when he chooses $\Psi_i > \Psi_i^*$ he loses 

$$\int_{\Psi_i^*}^{\Psi_i} \left(c^i_{CW}(h_i) - v\right) \lambda_i e^{-\lambda_i h_i}\, dh_i > 0.$$

On the other hand, when he changes both $\Psi_i$ and the policy at the second stage, his 
utility is either $v$ (when he plays CC) or $E[c^i_{CW}(h_i)]$ (when he uses policy WW), which are 
clearly both less than his current utility 

$$P(h_i \in H_-)\, v + P(h_i \in H_+)\, E[c^i_{CW}(h_i) \mid h_i \in H_+],$$

so $(\Psi_1^*, \Psi_2^*)$ is an equilibrium choice. 



The last thing we need to show is that there are $\Psi_1^*$ and $\Psi_2^*$ satisfying (15). To this end, 
construct the functions$^7$ $\Psi_i(\Psi_j) := \{\Psi_i : c^i_{CW}(\Psi_i)(\Psi_j) = v\}$. It is immediate to see 
that since all the functions $c^i$ are nondecreasing, $\Psi_i$ are non-increasing functions from $[0, \infty)$ 
to itself. It is then obvious that the graphs of these two functions, $\{(\Psi_1, \Psi_2) : \Psi_1 = \Psi_1(\Psi_2)\}$ 
and $\{(\Psi_1, \Psi_2) : \Psi_2 = \Psi_2(\Psi_1)\}$, intersect, and thus (15) has a solution. 

7 Here we use a convention that in the second bracket we give the value of player $j$'s threshold, which appears in the 
definitions of $c^i$, but was omitted so far. 



B. Proof of Proposition 4 

Before we prove Proposition 4, we need an auxiliary lemma. 

Lemma 1. The functions $C^i_{[k,l]}(s)(\Psi)$ are decreasing in $k$, $l$ and increasing in $\Psi$ for any 
$s = 0, 1$. 



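Proof: First note that 

$$F(r,q) := \sum_{v=0}^{q} \binom{q}{v} \int_{[\Psi,\infty)^v \times \mathbb{R}^{n-q-1} \times [0,\Psi)^{q-v}} \lambda^{n-1} \log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j=1}^{r+v} h_j}\right) e^{-\lambda \sum_{j=1}^{n-1} h_j}\, dh_1 \ldots dh_{n-1}$$

is decreasing in $r$. 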

On the other hand F(r + 1, q) > F(r, q + 1), as 



[V',oo)"xR n -9- 2 x[0,>I')«- t '+ 1 



A"-Mog(l + 



ph 



)e- x ^ h >dh 1 ...dh n _ 1 



t( 9 )[ ^ 



log(l + 



^^^ ^ 



+ / log(l 



ph 



e 2^=1 j dhi . . . dh r+v dh r+v+2 . . . dh n _i 



> 



, n-2 



/■oo 

/ log(l + - 



ph 



)\e~ Xhr+v+1 dh r+v+1 



e a ^j=i hj dhi . . . dh r+v dh r+v+2 ■ ■ ■ dh n _ x = ( q ) / 

£0 w A 



A 



n-l 



' [V , ,oo) 7 ' xl"-?" 1 x [0,*)'?- 1 ' 



log(l 



ph 



)e~ x ^=l h idh 1 ...dh n - 1 



= F(r + 1, q) 

But this implies that $F$ is also decreasing in $q$. 

Next note that $c^i_{[k,l]}(h)$ is the expected value of $\sum_{q=0}^{l} \beta^q (1-\beta)^{l-q} \binom{l}{q} F(r,q)$ when $r$ is a 
random value with the binomial distribution $\mathrm{Bin}(k, \beta)$. As the distribution $\mathrm{Bin}(k+1, \beta)$ strictly 
stochastically dominates $\mathrm{Bin}(k, \beta)$, the expected value with respect to $\mathrm{Bin}(k+1, \beta)$ of any 
decreasing function is smaller than that with respect to $\mathrm{Bin}(k, \beta)$, and thus 

$$c^i_{[k+1,l]}(h) < c^i_{[k,l]}(h)$$

for any $h > 0$. But, as $C^i_{[k,l]}(0)$ and $C^i_{[k,l]}(1)$ are conditional expectations of $c^i_{[k,l]}(h)$ over 
some fixed sets, this immediately implies that they are both decreasing in $k$. The fact that 
they are decreasing in $l$ is proved analogously; the only difference is that the monotonicity 
of $F$ in $q$ (instead of the monotonicity in $r$) is used. 
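
The stochastic dominance step can be checked numerically for any concrete decreasing function. The sketch below computes binomial expectations exactly; the test function $g(r) = 1/(1 + r)$ is an arbitrary decreasing function chosen for the illustration and is not the function $F$ of the proof.

```python
from math import comb

def binom_expectation(g, k, beta):
    """E[g(r)] for r ~ Bin(k, beta), computed from the probability mass function."""
    return sum(comb(k, r) * beta ** r * (1.0 - beta) ** (k - r) * g(r)
               for r in range(k + 1))

def decreasing_example(r):
    """An arbitrary strictly decreasing function of the number of interferers."""
    return 1.0 / (1.0 + r)
```

For example, with $\beta = 0.3$ the expectation of `decreasing_example` is 1 for $k = 0$ and 0.85 for $k = 1$, and it keeps decreasing as $k$ grows.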

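To prove the last part of the lemma, take $\Psi_1 < \Psi_2$ and define, for any $q$ and $a \in \{0,1,2\}^q$, 

$$S_a(\Psi_1, \Psi_2) := \{(h_1, \ldots, h_{n-1}) : 0 \le h_j < \Psi_1 \text{ if } a_j = 0,\ \Psi_1 \le h_j < \Psi_2 \text{ if } a_j = 1,\ \Psi_2 \le h_j \text{ if } a_j = 2\}.$$

Note that $\mathbb{R}^{n-1} = \bigcup_a S_a(\Psi_1, \Psi_2)$. Next note that $F(r,q)(\Psi_1)$ is the integral over $\mathbb{R}^{n-1}$ of 
the function $f_1$ defined on each $S_a(\Psi_1, \Psi_2)$ separately, as 

$$\log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j:\, a_j \ge 1} h_j}\right) \lambda^{n-1} e^{-\lambda \sum_j h_j}.$$

On the other hand, $F(r,q)(\Psi_2)$ is the integral over $\mathbb{R}^{n-1}$ of the function $f_2$ defined on each 
$S_a(\Psi_1, \Psi_2)$ separately, as 

$$\log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j:\, a_j = 2} h_j}\right) \lambda^{n-1} e^{-\lambda \sum_j h_j}.$$

Clearly $f_1 \le f_2$, and so $F(r,q)(\Psi_1) \le F(r,q)(\Psi_2)$ for any $r$ and $q$. This immediately implies 
that also $c^i_{[k,l]}(h)(\Psi_1) \le c^i_{[k,l]}(h)(\Psi_2)$. However, note that since $c^i_{[k,l]}(h)$ are also increasing in 
$h$: 

$$C^i_{[k,l]}(1)(\Psi_1) = E[c^i_{[k,l]}(h)(\Psi_1) \mid h > \Psi_1] \le E[c^i_{[k,l]}(h)(\Psi_1) \mid h > \Psi_2] \le E[c^i_{[k,l]}(h)(\Psi_2) \mid h > \Psi_2] = C^i_{[k,l]}(1)(\Psi_2).$$

Similarly 

$$C^i_{[k,l]}(0)(\Psi_1) = E[c^i_{[k,l]}(h)(\Psi_1) \mid h < \Psi_1] \le E[c^i_{[k,l]}(h)(\Psi_1) \mid h < \Psi_2] \le E[c^i_{[k,l]}(h)(\Psi_2) \mid h < \Psi_2] = C^i_{[k,l]}(0)(\Psi_2),$$

which ends the proof of the lemma. 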

■ 

Now we are able to prove Proposition 4. 

Proof: First note that it is clear from the definition of $c^i_{[k,l]}(h)$ that it is increasing in $h$ 
and thus 

$$C^i_{[k,l]}(0) < C^i_{[k,l]}(1) \qquad (23)$$

for any values of $k$ and $l$. Next, it is enough to check the definition of Bayes-Nash equilibrium 
(using (23) if needed) to see that the sets of inequalities appearing in the proposition define 
the respective equilibria. What is left to show is that cases (a)-(e) cover all the possible situations. 
Suppose that none of the cases (a) and (b) holds. Then, since $C^i_{[0,0]}(1) > v$, by Lemma 1 
also $C^i_{[0,n-1]}(1) > v$. However, (c), (d) and (e) cover all the possible cases then. ■ 

C. Proof of Proposition 5 

Proof: 
Part 1) is obvious. 2) Since $C^i_{[0,0]}(1)(\Psi) \to C^i_{[0,0]}(\infty)$ when $\Psi \to 0$, if $C^i_{[0,0]}(\infty) < v$, then 
for $\Psi$ small enough (in the worst case, equal to 0) also $C^i_{[0,0]}(1)(\Psi) < v$, which means 
that all the players apply policy CC in equilibrium at the second stage of the game. Thus 
whenever $\Psi$ is small enough, the outcome of the Stackelberg game is that all the players use 
3G LTE with probability 1, which is the biggest possible value of the base station's utility. 

Now suppose that $C^i_{[0,0]}(\infty) > v$. Then even for the value of $\Psi$ equal to 0, not every player 
uses policy CC at the equilibrium of the game of the second stage. Thus, to maximize the 
probability of all the players using 3G LTE, the base station has to choose $\Psi$ in such a 
way that at the equilibrium of the game of the second stage all of the players apply either 
policy CC or CW and that the probability that the state of the players using policy CW is 
$(h_i < \Psi)$ is the highest possible. This is done by solving the optimization problems of finding 
$\Psi$ maximizing $(1 - e^{-\lambda \Psi})^k$ subject to $C^i_{[0,k-1]}(0)(\Psi) \le v$. However, as by Lemma 1 
$C^i_{[0,k-1]}(0)(\Psi)$ is an increasing function of $\Psi$ for any fixed $k$, this maximum is achieved for 
$\Psi$ satisfying $C^i_{[0,k-1]}(0)(\Psi) = v$, which ends the proof. ■ 

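D. Proof of Corollary 6 

Proof: 
Let us assume that 

$$k > \frac{2}{\beta} + 1. \qquad (24)$$

First note that $C^i_{[k-1,0]}(\infty)$ is 

$$\sum_{r=0}^{k-1} \binom{k-1}{r} \beta^r (1-\beta)^{k-1-r}\, E\left[\log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j=1}^{r} h_j}\right)\right], \qquad (25)$$

where $h$ and $h_1, \ldots, h_{k-1}$ are independent exponentially distributed random values with 
common parameter $\lambda$. 

Next let $r^* = \frac{(k-1)\beta}{2}$. Now (25) can be rewritten as 

$$\sum_{r < r^*} \binom{k-1}{r} \beta^r (1-\beta)^{k-1-r}\, E\left[\log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j=1}^{r} h_j}\right)\right] + \sum_{r \ge r^*} \binom{k-1}{r} \beta^r (1-\beta)^{k-1-r}\, E\left[\log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j=1}^{r} h_j}\right)\right].$$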

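Since the function $E\left[\log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j=1}^{r} h_j}\right)\right]$ is clearly positive and decreasing in $r$, the first element of 
this sum can be bounded from above by 

$$\mathrm{Prob}[r < r^*]\, E\left[\log\left(1 + \frac{\rho h}{\sigma^2}\right)\right],$$

where $\mathrm{Prob}[r < r^*]$ is the probability that a random value with binomial distribution $\mathrm{Bin}(k-1, \beta)$ 
is smaller than $r^*$. This probability, using Hoeffding's inequality [19], can be bounded 
above by $e^{-\frac{(k-1)\beta^2}{2}}$, and thus the whole term by $e^{-\frac{(k-1)\beta^2}{2}} E\left[\log\left(1 + \frac{\rho h}{\sigma^2}\right)\right]$. 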
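
The Hoeffding step can be compared with the exact binomial tail. The sketch below writes $m$ for the number of trials ($m = k - 1$ in the proof) and uses the threshold $m\beta/2$, for which Hoeffding's inequality gives the bound $e^{-m\beta^2/2}$. The function names are illustrative.

```python
from math import comb, exp

def lower_tail(m, beta, threshold):
    """Exact P[r < threshold] for r ~ Bin(m, beta)."""
    return sum(comb(m, r) * beta ** r * (1.0 - beta) ** (m - r)
               for r in range(m + 1) if r < threshold)

def hoeffding_bound(m, beta):
    """Hoeffding bound on P[r <= m * beta / 2] for r ~ Bin(m, beta)."""
    return exp(-m * beta ** 2 / 2.0)
```

For $m = 4$ and $\beta = 0.9$, for instance, the exact tail is $0.0037$, while the bound is $e^{-1.62} \approx 0.198$, so the bound holds but is far from tight for small $m$.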

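Analogously, the second element of the sum can be bounded from above by 

$$\mathrm{Prob}[r \ge r^*]\, E\left[\log\left(1 + \frac{\rho h}{\sigma^2 + \rho \sum_{j=1}^{r^*} h_j}\right)\right]$$

and further by 

$$E\left[\log\left(1 + \frac{h}{\sum_{j=1}^{r^*} h_j}\right)\right]. \qquad (26)$$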

Now note that $1 + \frac{h}{\sum_{j=1}^{r^*} h_j}$ is a random value with Pareto distribution [20, Chap. 20, Sec. 
12] with parameters 1 and $r^*$, whose average is (for $r^* > 1$, which is guaranteed by our 
assumption (24)) $\frac{r^*}{r^* - 1}$. Since the logarithm is a concave function, we can use Jensen's inequality 
to bound (26) from above by 

$$\log\left(\frac{r^*}{r^* - 1}\right) = \log\left(\frac{(k-1)\beta}{(k-1)\beta - 2}\right).$$
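
The Pareto claim and the resulting bound can be verified by a short computation. The derivation below assumes that $h$ and the $h_j$ are independent and exponentially distributed with parameter $\lambda$, treats $r^*$ as an integer, and takes $r^* = (k-1)\beta/2$; the symbols $S$ and $X$ are introduced only for this illustration.

```latex
\begin{align*}
S &:= \sum_{j=1}^{r^*} h_j \sim \mathrm{Gamma}(r^*, \lambda), \qquad X := 1 + \frac{h}{S},\\
P(X > x) &= P\big(h > (x-1)S\big) = E\big[e^{-\lambda (x-1) S}\big]
          = \left(\frac{\lambda}{\lambda + \lambda (x-1)}\right)^{r^*} = x^{-r^*}, \qquad x \ge 1,\\
E[X] &= \int_1^{\infty} x \, r^* x^{-r^*-1}\, dx = \frac{r^*}{r^*-1} \qquad (r^* > 1),\\
E[\log X] &\le \log E[X] = \log\frac{r^*}{r^*-1} = \log\frac{(k-1)\beta}{(k-1)\beta - 2}.
\end{align*}
```

The second line uses the moment generating function of the Gamma distribution, and the last line is Jensen's inequality for the concave logarithm.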

This implies that 

$$C^i_{[k-1,0]}(\infty) < e^{-\frac{(k-1)\beta^2}{2}} \int_0^{\infty} \log\left(1 + \frac{\rho h}{\sigma^2}\right) \lambda e^{-\lambda h}\, dh + \log\left(\frac{(k-1)\beta}{(k-1)\beta - 2}\right)$$

and consequently that for any $k$ such that the RHS of the above inequality equals $v$ the LHS 
will be smaller than $v$ and thus $k > n^*$. ■ 